AITopics | counterfactual test

BPL: Bias-adaptive Preference Distillation Learning for Recommender System

Kang, SeongKu, Lian, Jianxun, Lee, Dongha, Kweon, Wonbin, Jang, Sanghwan, Lee, Jaehyun, Wang, Jindong, Xie, Xing, Yu, Hwanjo

arXiv.org Artificial Intelligence, Oct-21-2025

Abstract: Recommender systems suffer from biases that cause the collected feedback to incompletely reveal user preference. While debiasing learning has been extensively studied, existing methods have mostly focused on the specialized (called counterfactual) test environment simulated by random exposure of items, significantly degrading accuracy in the typical (called factual) test environment based on actual user-item interactions. In fact, each test environment highlights the benefit of a different aspect: the counterfactual test emphasizes user satisfaction in the long term, while the factual test focuses on predicting subsequent user behaviors on platforms. Therefore, it is desirable to have a model that performs well on both tests rather than only one. In this work, we introduce a new learning framework, called Bias-adaptive Preference distillation Learning (BPL), to gradually uncover user preferences with dual distillation strategies. These distillation strategies are designed to drive high performance in both factual and counterfactual test environments. Employing a specialized form of teacher-student distillation from a biased model, BPL retains accurate preference knowledge aligned with the collected feedback, leading to high performance in the factual test. This enables the model to produce more accurate predictions across a broader range of user-item combinations, thereby improving performance in the counterfactual test.

Real-world recommender systems form a feedback loop in which the systems' recommendations influence user behaviors, which in turn serve as training data for the system [1]. This feedback loop creates and amplifies various biases that arise from multiple factors, including but not limited to user selection patterns, item exposure mechanisms, and the influence of public opinion [2], [3]. These biases progressively cause the training data to deviate from users' true preferences, ultimately degrading user satisfaction.

SeongKu Kang is with the Department of Computer Science and Engineering, Korea University, Seoul, South Korea. Dongha Lee is with the Department of Artificial Intelligence, Yonsei University, Seoul, South Korea. E-mail: donalee@yonsei.ac.kr. Jindong Wang is with William & Mary, Virginia, United States.
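The distinction between the factual and counterfactual tests in the abstract above can be illustrated with a minimal sketch: the same ranked recommendations are scored once against held-out logged interactions and once against feedback gathered under random item exposure. Everything here is a hypothetical illustration (the `recall_at_k` metric, the toy users and items, and the function names); it is not the paper's evaluation protocol.

```python
def recall_at_k(ranked_items, relevant_items, k):
    """Fraction of a user's relevant items that appear in the top-k ranking."""
    if not relevant_items:
        return 0.0
    hits = len(set(ranked_items[:k]) & set(relevant_items))
    return hits / len(relevant_items)


def evaluate(rankings, test_set, k):
    """Mean recall@k over users that have at least one relevant test item."""
    users = [u for u in test_set if test_set[u]]
    return sum(recall_at_k(rankings[u], test_set[u], k) for u in users) / len(users)


# Hypothetical model output: a ranked item list per user.
rankings = {"u1": ["a", "b", "c", "d"], "u2": ["b", "a", "d", "c"]}

# Factual test: held-out items taken from actual (biased) user-item interactions.
factual_test = {"u1": {"a"}, "u2": {"b"}}

# Counterfactual test: items liked when exposure was randomized (assumed data).
counterfactual_test = {"u1": {"d"}, "u2": {"a", "c"}}

factual_score = evaluate(rankings, factual_test, k=2)
counterfactual_score = evaluate(rankings, counterfactual_test, k=2)
print(factual_score, counterfactual_score)
```

In this toy data the model does well on the factual test and poorly on the counterfactual one, which is the kind of gap a single model performing well on both tests would need to close.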

artificial intelligence, counterfactual test, machine learning, (15 more...)

doi: 10.1109/TKDE.2025.3619575

arXiv: 2510.16076

Country:

Asia > South Korea > Seoul > Seoul (0.44)
North America > United States > Virginia (0.24)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Faithfulness of LLM Self-Explanations for Commonsense Tasks: Larger Is Better, and Instruction-Tuning Allows Trade-Offs but Not Pareto Dominance

Siegel, Noah Y., Heess, Nicolas, Perez-Ortiz, Maria, Camburu, Oana-Maria

arXiv.org Artificial Intelligence, Mar-17-2025

As large language models (LLMs) become increasingly capable, ensuring that their self-generated explanations are faithful to their internal decision-making process is critical for safety and oversight. In this work, we conduct a comprehensive counterfactual faithfulness analysis across 62 models from 8 families, encompassing both pretrained and instruction-tuned variants and significantly extending prior studies of counterfactual tests. We introduce phi-CCT, a simplified variant of the Correlational Counterfactual Test, which avoids the need for token probabilities while explaining most of the variance of the original test. Our findings reveal clear scaling trends: larger models are consistently more faithful on our metrics. However, when comparing instruction-tuned and human-imitated explanations, we find that observed differences in faithfulness can often be attributed to explanation verbosity, leading to shifts along the true-positive/false-positive Pareto frontier. While instruction-tuning and prompting can influence this trade-off, we find limited evidence that they fundamentally expand the frontier of explanatory faithfulness beyond what is achievable with pretrained models of comparable size. Our analysis highlights the nuanced relationship between instruction-tuning, verbosity, and the faithful representation of model decision processes.
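As a rough illustration of a correlational test that needs no token probabilities, the sketch below computes the phi coefficient (the Pearson correlation of two binary variables) from a 2x2 contingency table. It assumes, as a labeled guess rather than something stated in the abstract, that the two indicators are "the counterfactual edit changed the model's prediction" and "the explanation mentions the edit"; the data and names are hypothetical.

```python
from math import sqrt


def phi_coefficient(xs, ys):
    """Phi coefficient between two equal-length sequences of binary values."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    # Cells of the 2x2 contingency table.
    n11 = sum(1 for x, y in zip(xs, ys) if x and y)
    n10 = sum(1 for x, y in zip(xs, ys) if x and not y)
    n01 = sum(1 for x, y in zip(xs, ys) if not x and y)
    n00 = sum(1 for x, y in zip(xs, ys) if not x and not y)
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        raise ValueError("phi is undefined when either variable is constant")
    return (n11 * n00 - n10 * n01) / denom


# Hypothetical per-example indicators (assumed, not taken from the paper):
# did the edit flip the prediction, and did the explanation mention the edit?
prediction_changed = [1, 1, 0, 0, 1, 0, 1, 0]
explanation_mentions = [1, 1, 0, 0, 1, 0, 0, 1]

phi = phi_coefficient(prediction_changed, explanation_mentions)
print(phi)
```

A value near 1 would mean explanations mention an edit mostly when it actually changed the prediction; a value near 0 would mean the mentions are unrelated to the edit's impact.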

large language model, machine learning, natural language, (19 more...)

arXiv: 2503.13445

Country:

Asia > Middle East > Republic of Türkiye (0.06)
Europe > France (0.04)
North America > United States > New York (0.04)
(23 more...)

Genre: Research Report > New Finding (0.47)

Industry:

Retail (1.00)
Media (1.00)
Health & Medicine (1.00)
(6 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Bias Beyond English: Counterfactual Tests for Bias in Sentiment Analysis in Four Languages

Goldfarb-Tarrant, Seraphina, Lopez, Adam, Blanco, Roi, Marcheggiani, Diego

arXiv.org Artificial Intelligence, May-19-2023

Sentiment analysis (SA) systems are used in many products and hundreds of languages. Gender and racial biases are well-studied in English SA systems, but understudied in other languages, with few resources for such studies. To remedy this, we build a counterfactual evaluation corpus for gender and racial/migrant bias in four languages. We demonstrate its usefulness by answering a simple but important question that an engineer might need to answer when deploying a system: What biases do systems import from pre-trained models when compared to a baseline with no pre-training? Our evaluation corpus, by virtue of being counterfactual, not only reveals which models have less bias, but also pinpoints changes in model bias behaviour, which enables more targeted mitigation strategies. We release our code and evaluation corpora to facilitate future research.
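The idea of a counterfactual evaluation corpus can be sketched as follows: each test sentence is paired with a copy that differs only in a demographic term, and any systematic difference in the sentiment score between the two is read as bias. The templates, the `toy_score` function (a deliberately biased stand-in for a real sentiment model), and all names below are hypothetical and are not drawn from the released corpus or code.

```python
from statistics import mean


def make_counterfactual_pairs(templates, term_a, term_b):
    """Fill each template twice, changing only the demographic term."""
    return [(t.format(person=term_a), t.format(person=term_b)) for t in templates]


def mean_score_gap(score_fn, pairs):
    """Average score difference between the two sides of each pair."""
    return mean(score_fn(a) - score_fn(b) for a, b in pairs)


def toy_score(text):
    """Toy lexicon scorer with an intentional penalty on the word 'she'."""
    lexicon = {"great": 1.0, "terrible": -1.0, "she": -0.5}
    words = text.lower().replace(".", "").split()
    return sum(lexicon.get(w, 0.0) for w in words)


templates = ["{person} made a great dinner.", "{person} gave a terrible talk."]
pairs = make_counterfactual_pairs(templates, "He", "She")
gap = mean_score_gap(toy_score, pairs)
print(gap)
```

An unbiased scorer would give a gap of zero on such pairs; running the same pairs through two systems (for example, with and without pre-training) would show how the gap shifts between them.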

artificial intelligence, computational linguistic, natural language, (16 more...)

arXiv: 2305.11673

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > Germany (0.04)
Asia > Japan (0.04)
(6 more...)

Genre: Research Report (0.64)

Industry:

Government > Immigration & Customs (0.47)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.46)
Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.86)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.86)
